Abstract
to be unattainable because ultimately the concept can be defined only in intuitive
terms; and (ii) any mathematical measure used to characterize the branching

In physical science ... the most important and most fruitful concepts are those
to  which  it  is  impossible  to  attach  a  well-defined  meaning. 

H.A.  Kramers  (1947) 


Introduction 


In  chemistry,  frequent  use  is  made  of  a  number  of  concepts  which,  in  a  strictly 
mathematical  sense,  are  ill-defined.  Examples  include  the  concepts  of  aromaticity, 
complexity,  shape  and  structure,  all  of  which  have  been  widely  used  to  describe 
molecular  species,  yet  none  of  which  has  been  precisely  defined.  Although  this 
lack  of  precision  on  the  part  of  chemists  appears  not  to  have  seriously  impeded 
the  progress  of  chemistry  to  date,  there  are  signs  that  precise  definitions  of 
several  commonly  employed  concepts  could  make  an  important  contribution 
to  the  future  development  of  the  subject.  Accordingly,  we  shall  focus  here  on 
one  such  concept,  namely  the  concept  of  branching  in  molecular  species,  and 
explore  the  ways  in  which  it  has  been  approached  by  both  chemists  and 
mathematicians.  Graph-theoretical  ideas  would  appear  to  be  highly  relevant 
in  this  context,  for  the  problem  has  already  been  tackled  by  several  workers 
in  the  mathematical  literature  [17,32,83,90].  In  the  molecular  graphs  used  by 
chemists  to  represent  chemical  species,  branching  has  traditionally  been  considered 
to  occur  whenever  the  graphs  contained  at  least  one  vertex  having  a  valence 
greater  than  two.  Moreover,  the  higher  the  valence  of  the  vertices,  the  greater 
the  extent  of  branching  in  the  species  was  said  to  be.  This  notion  has  been 
formalized  in  terms  of  the  valence  partitioning  of  the  vertices  of  molecular 
graphs.  Nonisomorphic  graphs  having  identical  extents  of  branching  were  described 
[78]  as  differing  in  their  'branching  content'.  Before  pursuing  such  ideas  further 
here,  we  now  pause  to  introduce  some  necessary  chemical  terminology. 


For  well  over  a  century  it  has  been  known  [26]  that  two  chemical  compounds 
which  have  the  same  chemical  formula  may  differ  in  the  internal  arrangement 
of  their  atoms.  Two  such  compounds  are  referred  to  as  chemical  isomers;  isomers 
always  differ  from  one  another  in  at  least  one  of  their  physicochemical  properties. 
Overall,  isomers  have  been  classified  into  two  broad  categories  designated  as 
constitutional  isomers  and  stereoisomers.  Several  schemes  for  the  detailed 
classification  of  isomers  have  been  developed  in  recent  years  [8,15,83],  and  at 
least  30  different  subclasses  of  isomers  are  now  recognized  by  chemists  [72]. 
Computer  programs  for  the  enumeration  of  most  of  these  subclasses  are  also 
available  [43].  Our  interest  here  will  focus  only  on  the  former  category  of  isomers, 
i.e.  the  constitutional  isomers,  which  have  also  been  widely  referred  to  in  the 
past  as  structural  isomers  [35].  We  elect  not  to  use  this  latter  term,  however, 
since,  as  mentioned  above,  the  term  structure  is  ill-defined  in  the  chemical 
context,  and  in  certain  of  its  various  meanings  the  adjective  'structural'  has 
therefore  become  somewhat  ambiguous.  Constitutional  isomers  may  be  regarded 
as  discrete  molecular  entities  whose  atoms  are  bonded  together  and  held  at 
approximately  fixed  positions  in  space  relative  to  one  another  as  a  result  of  the 
constraints  imposed  upon  their  mutual  motions  by  the  bonding  interactions  [51]. 
A  pair  of  constitutional  isomers  must  differ  in  both  the  sequence  and  the  nature 
of  the  bonding  interactions  occurring  between  their  respective  atoms  [43]. 

In the mid-1850s Cayley [16] first depicted the constitutional isomers of the
members of certain homologous series, namely the alkanes, CnH2n+2, and
monosubstituted alkanes, CnH2n+1X, using graphs. He clearly established that a pair
of constitutional isomers will always be possessed of two nonisomorphic graphs
and that there is a 1:1 correspondence between the alkane isomers having n carbon
atoms and the relevant tree graphs on n vertices. The relevant tree graphs in this case
are allowed to have a maximum vertex degree of four. Cayley also enumerated
the isomers for the first several members of each series; later workers have
subsequently corrected (where necessary) and substantially extended these early
results [73,43]. We shall consider only alkane species here, for these species
conveniently  exemplify  the  nature  of  the  problems  we  propose  to  discuss  in  this 
paper.  The  numbers  of  isomers  for  several  different  members  of  the  alkane 
series  are  presented  in  Table  1.  Given  that  the  valence  of  the  carbon  atom  is 
four  and  that  of  the  hydrogen  atom  is  one,  it  is  easy  to  demonstrate  that  alkane 
species  contain  the  maximum  ratio  of  hydrogen  to  carbon  of  all  the  hydrocarbons 
[75].  For  our  purposes  it  will  be  sufficient  to  represent  the  alkanes  by  their  carbon 
backbones  and  to  ignore  the  hydrogen  atoms,  which  can  usually  be  inferred  without 
difficulty and which in any case are nonessential in that they are not
structure-determining. Graphs depicting only the carbon skeleton of hydrocarbon species
are widely used in mathematical chemistry and are referred to as
hydrogen-suppressed graphs.

TABLE 1

In  this  first  comprehensive  review  on  the  mathematical  description  of  molecular 
branching,  we  shall  highlight  the  problem  of  characterizing  in  a  chemically 
meaningful way the hydrogen-suppressed graphs of members of the alkane series,
CnH2n+2. Ideally, such characterizations should satisfy two criteria, viz. (i)
they  should  be  unique  in  purely  graph-theoretical  terms,  and  (ii)  they  should 
accurately  reflect  the  physicochemical  properties  of  the  species  being 
characterized.  It  is  fair  to  point  out  that  it  is  not  possible  to  satisfy  both  of 
these  criteria  simultaneously  at  present.  Although  it  is  certainly  feasible  to 
characterize  species  uniquely,  e.g.  by  means  of  their  adjacency  matrix  or  by 
some  appropriate  code  [82],  characterizations  of  this  kind  are  not  only  unwieldy 
but,  more  importantly,  they  usually  fail  to  provide  a  sufficiently  reliable 
description  of  the  physicochemical  and  other  properties.  On  the  other  hand, 
all of the simple numerical descriptors of species which have been employed
to date have subsequently been shown to be nonunique. The problem chemists
are confronted with is thus a challenging one and no completely satisfactory
solution appears to be in sight. Over the past decade, however, steady progress
has been made and some important new insights have been gained. It is our purpose
now  to  review  the  current  state  of  the  art  in  the  characterization  of  molecular 
branching  though,  in  an  effort  to  keep  the  number  of  literature  citations  down 
to  manageable  proportions,  only  key  references  will  be  given.  We  set  the  scene 
by  first  exploring  the  question  whether  it  is  feasible  to  attempt  to  characterize 
branching  in  purely  mathematical  terms. 


The  Measurement  of  Branching 

Virtually  all  of  the  physicochemical  properties  of  alkane  species  are  either 
greatly  influenced  by  or  substantially  dependent  upon  the  degree  of  branching 
present  in  their  constituent  molecules.  One  notable  example  of  such  a  property, 
which  has  very  important  commercial  implications,  is  the  octane  rating  assigned 
to  fuels  used  in  automobiles  and  other  vehicles.  In  effect,  the  octane  rating 
of  a  fuel  determines  its  quality,  for  the  higher  the  rating  the  less  likely  the  fuel 
will  be  to  self-ignite  upon  sudden  compression  in  air.  The  octane  rating  of  an 
alkane  fuel  is  directly  dependent  upon  the  amount  of  branching  present  in  its 
component molecules [3]. Even from this isolated example, the crucial importance
of  the  concept  of  branching  to  chemistry  should  be  evident.  What  chemists  lack, 
however,  is  some  effective  means  of  measuring  the  amount  of  branching  present 
in  molecules  based  on  some  universally  agreed  definition.  As  indicated  above, 
the  notion  of  branching  has  traditionally  been  described  in  purely  intuitive  terms 
[24],  such  as  the  number  of  vertices  of  degree  greater  than  two  in  the  chemical 
graph.  We  discuss  now  whether  it  is  possible  to  improve  upon  this  seemingly 
unsatisfactory method of interpreting a highly important molecular property.

One  approach  to  the  problem  favored  by  chemists  in  recent  years  has  been 
to  attempt  to  order  chemical  graphs  according  to  some  set  of  well-defined 
mathematical  criteria.  Once  such  an  ordering  has  been  achieved,  the  second 
issue  of  whether  the  ordering  matches  any  of  the  orderings  based  on  the  various 
physicochemical  properties  of  molecules  can  then  be  addressed.  Let  us  start 
with  a  simple  Gedankenexperiment.  If  we  consider  two  tree  graphs,  one  in  the 
form  of  a  path  and  the  other  in  the  form  of  a  star,  it  is  immediately  obvious 
which  of  the  two  is  more  branched.  Thus,  any  scheme  we  may  devise  to  order 
branched  molecular  species  must  always  give  precedence  to  the  star  graph  over 
the  path  graph.  When  comparisons  of  certain  other  pairs  of  tree  graphs  are  made, 
however,  intuition  is  no  longer  sufficient.  For  instance,  it  is  by  no  means  obvious 
which of the three graphs illustrated in Figure 1 is the most branched, even though
the molecules they represent can certainly be ordered hierarchically in terms
of their physicochemical properties. Numerous other equally indeterminate
examples might be cited. We now explore the contribution which ordering can
make to the solution of problems of this type, bearing in mind that most ordering
procedures merely define a hierarchy but do not assign absolute values to the
degree of branching present in molecular species.

FIGURE 1


Any  ordering  of  structures  necessarily  implies  that  comparisons  have  to  be 
made.  In  the  chemical  context,  the  comparisons  are  frequently  made  between 
sequences  of  numbers  which  are  used  to  identify  the  structures  they  represent. 
The numbers chosen might be integers; one convenient way of obtaining these
is to take the vertex degrees of the hydrogen-suppressed graphs arranged as
a nonascending sequence, i.e. $v_i \ge v_{i+1}$ for all i = 1, 2, ..., (n - 1). Two sequences
of numbers of the same length are said to be comparable if there exists an
inequality between them for all intervals defined by the values of the variables.
Comparability can be tested for by constructing sequences of partial sums. To
illustrate this, let us suppose that the two sequences are $V = \{v_i\}$ and $V' = \{v_i'\}$.
Now, for all the $v_i$ and $v_i'$ these sequences will be comparable only if V > V' or
V < V' for all the intervals. Muirhead [14,15] defined a relative ordering for
such sequences by imposing the conditions:

$$\sum_{i=1}^{k} v_i \ge \sum_{i=1}^{k} v_i', \quad \text{where } 1 \le k \le n \qquad (1)$$

$$\sum_{i=1}^{n} v_i = \sum_{i=1}^{n} v_i' \qquad (2)$$

Whenever  these  conditions  are  satisfied,  sequence  V  is  said  to  precede  sequence 
V'. 
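As an illustration, conditions (1) and (2) can be checked mechanically from partial sums. The following is a minimal sketch in Python; the function names (`degree_sequence`, `precedes`, `compare`) and the edge-list representation of the hydrogen-suppressed graphs are assumptions made for this example and do not come from the works cited.

```python
from itertools import accumulate

def degree_sequence(edges):
    """Nonascending vertex-degree sequence of a graph given as an edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return sorted(deg.values(), reverse=True)

def precedes(v, w):
    """True if sequence v precedes sequence w under conditions (1) and (2)."""
    if len(v) != len(w) or sum(v) != sum(w):      # condition (2): equal totals
        return False
    # condition (1): every partial sum of v is at least that of w
    return all(a >= b for a, b in zip(accumulate(v), accumulate(w)))

def compare(v, w):
    """Classify a pair of nonascending sequences."""
    if v == w:
        return "identical"
    if precedes(v, w):
        return "precedes"
    if precedes(w, v):
        return "follows"
    return "incomparable"

# Hydrogen-suppressed graphs of two pentane isomers (arbitrary vertex labels).
star5 = [(1, 2), (1, 3), (1, 4), (1, 5)]   # 2,2-dimethylpropane
path5 = [(1, 2), (2, 3), (3, 4), (4, 5)]   # n-pentane
print(compare(degree_sequence(star5), degree_sequence(path5)))   # precedes

# Degree sequences of two octane isomers that cannot be ordered.
dimethylhexane_22 = [4, 2, 2, 2, 1, 1, 1, 1]      # 2,2-dimethylhexane
trimethylpentane_234 = [3, 3, 3, 1, 1, 1, 1, 1]   # 2,3,4-trimethylpentane
print(compare(dimethylhexane_22, trimethylpentane_234))          # incomparable
```

The second pair fails in both directions: the partial sums run 4, 6, 8 against 3, 6, 9, so neither sequence dominates the other throughout.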

Such  criteria  were  first  introduced  into  the  chemical  literature  by  Gutman 
and  Randic  [30],  who  applied  them  to  the  ordering  of  alkane  isomers.  They  were 
able to show that a complete ordering is possible for all such isomers having
$n \le 7$, whereas for $n \ge 8$ only a partial ordering can be achieved. It is thus not
legitimate to compare certain pairs of isomeric structures having $n \ge 8$ since
Muirhead's conditions [50] are not fulfilled in all cases. The three structures
illustrated in Figure 1 are not comparable, for instance, since for all i we have
$v_i = v_i'$, that is to say the sequence to be compared equals 32222111 for each
of  these  isomers.  Later  refinements  of  these  conditions  have  not  brought  any 
significant improvement. Thus, the generalized conditions of Karamata [40,7],
which removed the restriction that only integers be used in the sequences, certainly
made  the  comparison  of  sequences  of  real,  nonintegral  numbers  possible,  though 
the  drawback  of  having  a  number  of  noncomparable  pairs  of  structures  in  the 
set  still  remained.  Randic  [59]  indicated  how  this  difficulty  might  be  alleviated 
to  some  extent  by  the  use  of  additional  information  in  the  form  of  several  new 
partial  sums  derived  for  the  sequences.  This  expedient,  however,  did  not 
satisfactorily  resolve  the  problem. 

From  a  different  vantage  point,  an  equivalent  approach  to  that  of  Randic 
[59] has emerged in recent years. In a fundamental study of the phenomenon of
chirality  in  molecules,  Ruch  [79,77]  made  use  of  Young  diagrams  [89],  which 
were  subsequently  shown  to  have  relevance  not  only  in  the  interpretation  of 
chirality  but  in  several  other  areas  as  well,  including  the  study  of  molecular 
branching  [80,34].  When  used  for  this  latter  purpose,  Young  diagrams  are 
constructed  by  ordering  the  vertex  degrees  of  graphs  in  a  nonascending  sequence 
as  described  above.  The  graphs  are  then  depicted  by  arrays  of  square  boxes  in 
which  each  of  the  rows  represents  a  single  vertex  and  the  number  of  boxes  in 
a  given  row  is  determined  by  the  degree  of  the  relevant  vertex.  The  Young 
diagrams  for  the  three  isomers  shown  in  Figure  1  will  all  be  based  on  the  vertex 
sequence 32222111 and are thus all identical, as is apparent from Figure 2. The
fact that these three isomers correspond to the same diagram makes the
limitations of the approach manifest. Clearly, only a partial ordering will be
possible by this means, for the ordering which results is precisely the same as
that attained by the use of Muirhead's criteria [78]. Accordingly, there is no
special advantage to be gained by adopting this particular approach to ordering;
we shall therefore not discuss it further here.

A more promising approach to the ordering of graphs was put forward by
Randic and Wilkins [68], who used paths of differing lengths as the basis for
their procedure rather than vertex degrees. In the tree graphs of alkane species,
the  enumeration  of  the  various  paths  present  in  the  graphs  is  straightforward. 
In  the  case  of  the  18  octane  isomers,  the  result  of  the  enumeration  is  presented 
in  tabular  form  in  Table  2.  For  reasons  of  convenience,  Randic  and  Wilkins  [68] 
ordered these isomers in terms of the pair of numbers ($p_2$, $p_3$), representing
respectively paths of lengths two and three. Strictly speaking, a septuple rather
than a pair should have been used to account for all the paths present, though
even their simplistic approach produced a surprisingly good ordering. The various
isomers were positioned on a grid according to their ($p_2$, $p_3$) values as illustrated
in Figure 3. The conditions invoked for the actual ordering were that two
structures were comparable only if $p_2 > p_2'$ and $p_3 < p_3'$; whenever these conditions
were satisfied, the points on the grid corresponding to the two structures were
connected. The ordering attained by this method is again only a partial one,
and two of the 16 points on the grid each correspond to a pair of structures having
identical ($p_2$, $p_3$) values. One of the identical pairs is the 3-methylheptane and
4-methylheptane pair, illustrated in Figure 1. It should be pointed out, however,
that if all paths had been used in the ordering process, a complete ordering of
all the 18 isomers would have been possible since no two isomers have all their
path length sequences identical. The approach was later extended to the sets
of alkane isomers having n = 9 (the 35 nonanes) [69] and n = 10 (the 75 decanes)
[64]  with  similar  results. 
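The path counts used in this approach are easy to generate for trees, because exactly one path joins each pair of vertices, so the number of paths of length k equals the number of vertex pairs at distance k. The sketch below, written in Python under that observation, computes the full septuple for the three monomethylheptanes; the function names and the integer vertex labels are assumptions made for illustration only.

```python
from collections import deque

def adjacency(edges):
    """Build an adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def distances_from(adj, source):
    """Breadth-first distances from one vertex to all others."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def path_counts(edges):
    """Return [p1, p2, ..., p(n-1)] for a tree given as an edge list."""
    adj = adjacency(edges)
    counts = [0] * (len(adj) - 1)
    for u in adj:
        for w, d in distances_from(adj, u).items():
            if u < w:                 # count each unordered pair once
                counts[d - 1] += 1
    return counts

chain = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]   # heptane backbone
m2 = chain + [(2, 8)]   # 2-methylheptane
m3 = chain + [(3, 8)]   # 3-methylheptane
m4 = chain + [(4, 8)]   # 4-methylheptane

print(path_counts(m2))   # [7, 7, 5, 4, 3, 2, 0]
print(path_counts(m3))   # [7, 7, 6, 4, 3, 1, 0]
print(path_counts(m4))   # [7, 7, 6, 5, 2, 1, 0]
```

The 3-methyl and 4-methyl isomers share the pair (7, 6) for paths of lengths two and three but are separated by the count of paths of length four, in line with the remark that the full path sequence discriminates where the pair does not.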


All  of  the  methods  discussed  so  far  for  discriminating  among  isomers  have 
depended  upon  the  use  of  numerical  codes,  namely  upon  sequences  of  nonascending 
vertex  degrees  or  upon  sequences  of  path  numbers.  In  this  section  we  shall  briefly 
examine codes which provide a unique characterization of species. We shall
again confine the discussion to alkane molecules, and say little about various
other, nonunique codes which have been put forward for species characterization.
In  fact,  we  can  only  touch  upon  the  subject  here,  for  the  study  of  codes  covers 
so  vast  an  area  that  it  deserves  a  separate  review  by  itself.  Now,  it  is  widely 
recognized  [57]  that  the  use  of  some  standard  numbering  procedure  for  the  vertices 
of  graphs  would  render  the  problem  of  establishing  the  isomorphism  of  a  pair 
of  graphs  an  essentially  trivial  one.  Once  such  a  procedure  has  been  devised, 
each  of  the  graphs  may  then  be  represented  by  a  so-called  canonical  matrix  and 
this  permits  an  ordering  of  those  graphs  e.g.  by  lexicographical  ordering  of  the 
matrices. 

Since  the  adjacency  matrix  is  known  [18]  to  characterize  any  graph  it  represents 
up  to  isomorphism,  many  workers  have  focused  attention  on  this  particular  matrix. 
The adjacency matrix, A(G), which may be defined as follows:

$$A(G) = \begin{cases} a_{ii} = 0 \\ a_{ij} = 0, & (i,j) \notin e(G) \\ a_{ij} = 1, & (i,j) \in e(G), \end{cases} \qquad (3)$$

where  e(G)  is  the  edge  set  of  G,  can  be  written  out  in  the  form  of  a  binary  number 
by  reading  the  rows  sequentially  from  left  to  right  and  from  top  to  bottom. 
Standard forms of presenting A(G) have been sought which would yield either
the maximum or the minimum binary number using this representation. The
problem has been examined from a variety of different standpoints, including
those of Nagle [52], who proposed a general linear ordering relation for graphs
to derive the canonical matrix; Randic [57,62], who devised canonical labeling
schemes for graphs based upon A(G) and who went on to apply these notions to
the study of topological symmetry [62,53]; El-Basil and coworkers [23,22], who
utilized codes based on the traces of $A(G)^k$, where $1 \le k \le n$, to characterize
both cyclic and noncyclic organic molecules; and Herndon and Leonard [36], who
extended the concepts of canonical labeling and unique linear notation to organic
and inorganic polyhedral cluster compounds.

To  illustrate  the  types  of  code  which  can  be  derived  from  canonical  labeling, 
we now consider the approach of Randic [57,62] in some detail. Since any graph
on n vertices will have a total of n! possible labelings, the three tree graphs in
Figure 1 will have 8! possible labelings. To reduce this large number, some
algorithm is necessary to devise a labeling which will yield a binary number of
minimum value without screening all the n! possibilities. Initially, Randic [62]
suggested that the labeling be obtained simply by permuting the rows and columns
of A(G) two at a time, starting with a graph having arbitrary labeling. It was
later demonstrated by MacKay [45], however, that such a procedure can result
in trapping in a local minimum, and is thus not foolproof. A more satisfactory
procedure, also developed by Randic [57], involved carrying out operations on
A(G) to ensure that its first row would have the maximum number of zeros in
it and that these would precede ones whenever possible. In terms of graph
labelings, this implies that the smallest label (1) should have as its immediate
neighbors vertices bearing the largest labels (n, n - 1, etc.). After treating the
first row of A(G) in this way, the second and subsequent rows are then dealt
with in the same manner. In general, this can be accomplished without difficulty,
for the procedure is a very efficient one [57]. Examples of the canonical labelings
and resulting codes for the three isomers of pentane (C5H12) are depicted in
Figure 4.

FIGURE 4
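For very small graphs the minimum binary code can be obtained by brute force, which gives a reference against which any faster labeling scheme may be checked. The Python sketch below screens all n! labelings of the pentane isomers; it is deliberately not the row-by-row procedure described above, and the function names and edge lists are assumptions introduced for this example.

```python
from itertools import permutations

def binary_code(n, edges, labeling):
    """Read the adjacency matrix row by row under a relabeling.

    `labeling` maps each original vertex to its new label in 1..n.
    """
    rows = [[0] * n for _ in range(n)]
    for u, v in edges:
        i, j = labeling[u] - 1, labeling[v] - 1
        rows[i][j] = rows[j][i] = 1
    return "".join(str(bit) for row in rows for bit in row)

def minimal_code(n, edges):
    """Smallest binary code over all n! labelings (feasible only for small n)."""
    vertices = range(1, n + 1)
    best = None
    for perm in permutations(vertices):
        code = binary_code(n, edges, dict(zip(vertices, perm)))
        # equal-length binary strings compare like the numbers they encode
        if best is None or code < best:
            best = code
    return best

pentane = [(1, 2), (2, 3), (3, 4), (4, 5)]       # n-pentane
isopentane = [(1, 2), (2, 3), (3, 4), (2, 5)]    # 2-methylbutane
neopentane = [(1, 2), (1, 3), (1, 4), (1, 5)]    # 2,2-dimethylpropane

for name, graph in [("pentane", pentane), ("isopentane", isopentane),
                    ("neopentane", neopentane)]:
    print(name, minimal_code(5, graph))
```

Because the minimum is taken over every labeling, the resulting code does not depend on how the vertices were numbered initially, and its first row has the form 00001, i.e. vertex 1 is a terminal vertex whose only neighbor carries the largest label.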

We conclude this section by making brief mention of a newly developed unique
code, known as a compact code. The evolution of this type of code can be traced
back at least two decades. Hiz [39] introduced the idea of linearizing chemical
graphs in the form of codes called ciphers, which omitted all extraneous chemical
information pertaining to the species represented. Knop et al. [42] developed
those ciphers for the purpose of enumerating the classes of molecules originally
studied by Cayley [16], namely the alkanes, CnH2n+2, and the substituted alkanes,
CnH2n+1X. Recently, Randic [55] demonstrated how these ciphers could be
adapted to the labeling of various molecules having tree graphs. The compact
code is constructed by locating the vertex (vertices) of highest degree (degrees)
and then writing nonascending vertex degree sequences for all the paths emanating
from such vertices. The various sequences are concatenated into one code
according to their lengths, with the longest being written down first. For the
three alkane isomers in Figure 1, the codes now differ and assume the forms
32222111, 32221211, and 32212211. Not only is the code useful for ordering
species, but direct reconstruction of the chemical species represented is also
possible, for a 1 can be interpreted as a primary carbon atom or methyl group
(CH3); a 2 as a secondary carbon atom or methylene group (CH2); a 3 as a tertiary
carbon atom or a methine group (CH); and a 4 as a quaternary carbon atom
(C).
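The digit-to-group interpretation just described can be illustrated with a short Python sketch. It recovers only the composition (the numbers of each kind of carbon atom and the molecular formula), not the full connectivity, and the names `GROUPS`, `carbon_types` and `molecular_formula` are hypothetical helpers introduced here.

```python
# Interpretation of each digit of the code as a carbon-centred group.
GROUPS = {1: "CH3", 2: "CH2", 3: "CH", 4: "C"}

def carbon_types(code):
    """Count the primary, secondary, tertiary and quaternary carbons in a code."""
    counts = {name: 0 for name in GROUPS.values()}
    for digit in code:
        counts[GROUPS[int(digit)]] += 1
    return counts

def molecular_formula(code):
    """Each carbon of degree d in the skeleton carries 4 - d hydrogen atoms."""
    carbons = len(code)
    hydrogens = sum(4 - int(d) for d in code)
    return f"C{carbons}H{hydrogens}"

for code in ("32222111", "32221211", "32212211"):
    print(code, carbon_types(code), molecular_formula(code))
```

All three codes yield C8H18 with one CH, four CH2 and three CH3 groups, as expected for isomers sharing the vertex degree sequence 32222111; it is the order of the digits that distinguishes them.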

The  Use  of  Polynomials  and  Eigenvalues 

An important graph invariant now being increasingly used in the
characterization of molecular branching is the characteristic polynomial, $P_G(x)$,
which is defined as $(-1)^n \det | A(G) - xE(G) |$, where E(G) is the unit matrix
for the graph G. Various methods for the evaluation of $P_G(x)$ have recently been
discussed by Randic [61]. Although this polynomial has long been known not to
provide a unique characterization of graphs [17], it has remained of interest
to chemists because the coefficients of $P_G(x)$ may be obtained from certain
combinations of subgraphs comprised of disjoint edges or cycles [80]. These
subgraphs are clearly related to the numbers of random and self-returning walks
in G, and also to the nonadjacent number and cycle counts. This fact led Randic
[56] to explore the idea of representing $P_G(x)$ in terms of summations of the
polynomials of paths on n vertices, $L_n(x)$, as defined in equation (4). In the case
of the three isomers in Figure 1, the $P_G(x)$ assume the forms (a) $L_9 - L_5$; (b)
$L_9 - L_5 - L_3$; and (c) $L_9 - L_5 - L_3 - L_1$. The coefficient of $L_5$ was found to reflect
the number and type of substitutions occurring on the main chain: for a methyl
(CH3) substitution it takes the value -1; for methyl substitutions at two different
atoms -2; for dimethyl substitution on the same atom -3; for disubstitution on
one atom and monosubstitution on another atom -4; for tetramethyl substitution
-5; and for one tetramethyl substitution and two other single substitutions -6.

An explicit closed form for $L_n(x)$ polynomials was originally presented by
Collatz and Sinogowitz [17] as follows:

$$L_n(x) = \sum_{k=0}^{[n/2]} (-1)^k \binom{n-k}{k} x^{n-2k}. \qquad (4)$$


From this expression, it is evident that the $L_n(x)$ may be written as Chebyshev
polynomials in x/2. Several prescriptions for discerning the general form assumed
by the $L_n(x)$ for various families of structures based on the graphs of alkane
isomers were put forward by Randic [56]. These prescriptions were generalized
by Hosoya and Randic [37], who derived a number of closed expressions and who
pointed out, for instance, that $x^n$ can be formulated as:

$$x^n = \sum_{k=0}^{[n/2]} \frac{n-2k+1}{n+1} \binom{n+1}{k} L_{n-2k}. \qquad (5)$$


The regrouping of the terms in these expansions of $P_G(x)$ renders the patterns
for the individual coefficients obvious in many cases. Thus, by focusing attention
on families of structurally related graphs, it is possible to utilize $P_G(x)$ for the
purpose of characterizing such graphs. The expansion based on Chebyshev
polynomials can also be used in those cases in which $P_G(x)$ exhibits sets of
identical spectra for pairs of nonisomorphic graphs.
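Equations (4) and (5) lend themselves to a direct numerical check. The Python sketch below stores each polynomial as a list of integer coefficients indexed by the power of x, builds $L_n(x)$ from the closed form (4), compares it with the standard three-term recurrence for path polynomials, $L_n = xL_{n-1} - L_{n-2}$ with $L_0 = 1$ and $L_1 = x$, and then sums the right-hand side of (5). The function names and the coefficient-list representation are choices made for this illustration.

```python
from math import comb

def path_polynomial(n):
    """Coefficients of L_n(x) from equation (4); list index = power of x."""
    coeffs = [0] * (n + 1)
    for k in range(n // 2 + 1):
        coeffs[n - 2 * k] = (-1) ** k * comb(n - k, k)
    return coeffs

def path_polynomial_recursive(n):
    """Same polynomial from L_n = x*L_(n-1) - L_(n-2), with L_0 = 1, L_1 = x."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        shifted = [0] + cur                                # multiply by x
        padded = prev + [0] * (len(shifted) - len(prev))   # align lengths
        prev, cur = cur, [a - b for a, b in zip(shifted, padded)]
    return cur

def power_of_x_expansion(n):
    """Sum the right-hand side of equation (5); the result should be x^n."""
    total = [0] * (n + 1)
    for k in range(n // 2 + 1):
        # (n - 2k + 1) * C(n + 1, k) is always divisible by n + 1
        weight = (n - 2 * k + 1) * comb(n + 1, k) // (n + 1)
        for power, c in enumerate(path_polynomial(n - 2 * k)):
            total[power] += weight * c
    return total

print(path_polynomial(5))        # [0, 3, 0, -4, 0, 1], i.e. x^5 - 4x^3 + 3x
print(power_of_x_expansion(4))   # [0, 0, 0, 0, 1], i.e. x^4
```

For n = 2, for example, the expansion reads $x^2 = L_2 + L_0 = (x^2 - 1) + 1$.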

Isospectral  graphs  have  been  investigated  for  many  years  and  numerous 
references  can  be  cited  in  the  mathematical  [17,32,83,90]  and  chemical  [61,91,65] 
literature.  Such  graphs  have  received  attention  from  chemists  principally  because 
the  eigenvalues  of  chemical  graphs  correspond  to  the  quantum-mechanically 
allowed  energy  levels  within  the  species  represented  [46].  Our  interest  here, 
however,  stems  from  the  observation  of  several  authors  [17,44,31,11]  that  the 
degree of branching in a graph is closely related to its maximum eigenvalue,
$\lambda_1$, frequently referred to as the spectral radius. Cvetkovic and Gutman [19]
were the first to demonstrate that $\lambda_1$ may be expressed in terms of the total
number, w(k), of walks of length k in a graph G by means of the following
approximate formula:

$$w(k) \approx n (\lambda_1)^k. \qquad (6)$$

The  approximation  becomes  an  equality  only  in  the  case  of  regular  graphs. 
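The walk counts entering equation (6) can be generated without forming matrix powers explicitly, by repeatedly propagating the number of walks ending at each vertex. The Python sketch below does this for a cycle, which is regular, and for 4-methylheptane, whose spectral radius is quoted later in the text as 2.000; the function name and edge lists are assumptions made for the example.

```python
def walk_count(n, edges, k):
    """Total number w(k) of walks of length k (the sum of all entries of A^k).

    Vertices are assumed to be labeled 1..n.
    """
    adj = [[] for _ in range(n + 1)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    ends = [0] + [1] * n      # ends[u] = walks of the current length ending at u
    for _ in range(k):
        ends = [0] + [sum(ends[w] for w in adj[u]) for u in range(1, n + 1)]
    return sum(ends)

# A regular graph: the cycle on six vertices, for which the largest eigenvalue is 2.
cycle6 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]
print(walk_count(6, cycle6, 5), 6 * 2 ** 5)      # 192 192

# A nonregular tree: 4-methylheptane, also with largest eigenvalue 2.
m4 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (4, 8)]
k = 10
print(walk_count(8, m4, k), 8 * 2 ** k)
print((walk_count(8, m4, k) / 8) ** (1 / k))     # estimate of the largest eigenvalue
```

For the cycle the two printed numbers coincide, whereas for the tree the walk count falls short of $n(\lambda_1)^k$, so the k-th root gives an estimate somewhat below 2.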

The result in equation (6) represents an interesting relationship between a
spectral property ($\lambda_1$) and a combinatorial property (w(k)) of a graph, and thereby
confirms the empirical finding that $\lambda_1$ provides a reliable measure of branching
in molecular graphs. Moreover, since $\lambda_1$ satisfies the inequality:

$$\bar{v} \le \lambda_1 \le v_{\max}, \qquad (7)$$

where $\bar{v}$ and $v_{\max}$ are the mean and maximum vertex degrees of G,
it may be interpreted [19] as a kind of mean vertex degree for the graph G. Lovasz
and Pelikan [44] were able to prove that if all trees on n vertices are ordered
according to their $\lambda_1$ value, the path will occupy the first position (minimal
$\lambda_1$) whereas the star will occupy the final position (maximal $\lambda_1$) in the sequence.
The three molecules illustrated in our Figure 1 are differentiated in terms of
their respective $\lambda_1$ values, having the values (a) 1.950, (b) 1.989, and (c) 2.000.
It should be reiterated, however, that neither $\lambda_1$ nor the complete set of
eigenvalues $\{\lambda_n\}$ offers a unique characterization of G. Thus, the graphs of
3-ethylpentane and 2,4-dimethylpentane have identical $\lambda_1$ values, and the two
graphs illustrated at the top of Figure 5 have identical sets of eigenvalues, $\{\lambda_n\}$.
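The eigenvalues quoted above can be reproduced with a few lines of Python. The sketch below estimates the largest eigenvalue by power iteration on A + I (the shift by the unit matrix is a standard device to avoid the oscillation that bipartite graphs such as trees would otherwise cause) and then evaluates the Rayleigh quotient. It assumes that the three graphs of Figure 1 are 2-, 3- and 4-methylheptane, which are the only octane isomers having the degree sequence 32222111; the function name, edge lists and iteration count are choices made for this illustration.

```python
import math

def spectral_radius(n, edges, iterations=2000):
    """Estimate the largest adjacency eigenvalue of a connected graph on 1..n."""
    adj = [[] for _ in range(n + 1)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    x = [0.0] + [1.0] * n                      # index 0 is unused
    for _ in range(iterations):
        y = [0.0] * (n + 1)
        for u in range(1, n + 1):
            y[u] = x[u] + sum(x[w] for w in adj[u])   # apply A + I
        norm = math.sqrt(sum(c * c for c in y))
        x = [c / norm for c in y]
    # Rayleigh quotient x^T A x for the unit vector x
    return sum(x[u] * x[w] for u in range(1, n + 1) for w in adj[u])

chain7 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]
m2 = chain7 + [(2, 8)]                         # 2-methylheptane
m3 = chain7 + [(3, 8)]                         # 3-methylheptane
m4 = chain7 + [(4, 8)]                         # 4-methylheptane
path8 = chain7 + [(7, 8)]                      # n-octane (path)
star8 = [(1, j) for j in range(2, 9)]          # star on eight vertices

for name, g in [("path", path8), ("2-methyl", m2), ("3-methyl", m3),
                ("4-methyl", m4), ("star", star8)]:
    print(name, round(spectral_radius(8, g), 3))
```

The three monomethylheptanes give 1.950, 1.989 and 2.000, and the path and the star on eight vertices bracket them at about 1.879 and 2.646, in agreement with the ordering result for trees stated above.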

Graph Invariants as Branching Descriptors

In the discussion of graph invariants many of the diverse lines of thought
introduced  above  find  their  natural  intersection.  We  are  concerned  here  only 
with  those  invariants  which  have  been  used  specifically  for  the  correlation  of 
molecular  structures  with  physicochemical  properties.  Although  a  wide  variety 
of  invariants  has  been  employed  for  this  purpose  over  the  past  four  decades, 
it  is  only  during  the  last  few  years  that  their  great  importance  to  chemistry 
has  been  fully  appreciated  [54].  Nowadays,  graph  invariants  are  usually  referred 
to  in  the  chemical  literature  as  topological  indices;  for  convenience,  we  shall 
refer  to  them  here  simply  as  indices.  In  recent  years  a  steady  stream  of  indices 
has  emerged,  allegedly  providing  an  increasingly  reliable  characterization  of 
molecular  branching.  We  shall  focus  especially  on  the  newer  indices  and  the 
claims  made  for  them,  for  it  is  neither  feasible  nor  appropriate  here  to  review 
comprehensively  the  vast  field  of  topological  indices;  interested  readers  are 
referred to several detailed reviews on the subject [11,49,74,2].

The first graph invariant to be used in chemistry was introduced in 1947 by
Wiener  [88],  and  is  commonly  referred  to  nowadays  as  the  Wiener  index,  W(G). 
The index was originally defined as the sum of the numbers of chemical bonds
separating all pairs of carbon atoms in a molecule, and later shown [38] to be equal
to one half the sum of the entries in the relevant distance matrix, i.e.:

$$W(G) = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij}(G). \qquad (8)$$

W(G)  has  been  widely  used  to  model  the  physicochemical  properties  of  chemical 
species,  such  as  boiling  point  and  refractive  index  [76].  Although  the  index  gives 
good  correlations  for  species  having  unbranched  graphs,  when  branched  species 
are  included  the  results  are  not  nearly  as  satisfactory.  This  is  well  illustrated 
by  the  plot  in  Figure  6,  which  reveals  the  wide  scatter  in  the  points  for  the  75 
decanes when the boiling point is plotted against W(G). In fact, the
correlation coefficient for linear regression turns out to be only 0.0035! Moreover,
W(G) is associated with a fairly high level of degeneracy; a pair of trees having
identical W(G) values is shown in Figure 5.
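
As an illustration of the definition of W(G) as one half of the sum of the entries of the distance matrix, the following Python sketch computes the index for a hydrogen-suppressed graph. The function name `wiener_index`, the edge-list representation and the vertex numbering are assumptions introduced here for illustration only; they are not taken from the works cited, and the graph is assumed to be connected.

```python
from collections import deque

def wiener_index(n, edges):
    """Return W(G): half the sum of all entries of the distance matrix D(G)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = 0
    for source in range(n):
        # breadth-first search yields one row of D(G)
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist)
    return total // 2  # every pair of vertices was counted twice

# Hydrogen-suppressed graphs of the three pentane isomers (illustrative labelings)
n_pentane = [(0, 1), (1, 2), (2, 3), (3, 4)]
isopentane = [(0, 1), (1, 2), (2, 3), (1, 4)]
neopentane = [(0, 1), (0, 2), (0, 3), (0, 4)]

print(wiener_index(5, n_pentane))   # 20
print(wiener_index(5, isopentane))  # 18
print(wiener_index(5, neopentane))  # 16
```

For these three isomers the value of W(G) decreases as the graph becomes more compact.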

Since  the  time  of  Wiener,  strenuous  endeavors  have  been  made  to  devise 
better  indices  than  W(G).  The  first  major  advance  came  in  1971  when  Hosoya 
[38]  introduced  an  index  of  the  form: 


$$ Z(G) = \sum_{k=0}^{[n/2]} p(G,k) , \qquad (9) $$



where p(G,k) is the number of ways in which k edges can be chosen from G such
that no two of them are adjacent; by definition p(G,0) = 1 and p(G,1) = n_e, the
number of edges in G. For trees, the characteristic polynomial and Z(G) are
interrelated, and this polynomial can be expressed in terms of the p(G,k) as
follows:

$$ P_{G=T}(x) = \sum_{k=0}^{[n/2]} (-1)^{k} p(G,k) x^{n-2k} . \qquad (10) $$

Gutman [29] has shown that Z(G) is particularly well suited to reflect the
alternations in boiling point in monomethyl alkanes as the methyl group is displaced
along the main carbon chain. The index suffers from the drawback, however,
that it too displays a high level of degeneracy, i.e. it is far from being one-to-one
for the classes of graphs of interest here.
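
The counting of sets of mutually nonadjacent edges that underlies Z(G) can be sketched in Python as follows. The recursion used (an edge is either left out, or chosen together with the exclusion of all edges adjacent to it) is one standard way of counting such sets and is an assumption of this sketch, not necessarily the procedure used in the works cited; the function names are likewise hypothetical. The running time grows exponentially with the number of edges, so the sketch is suitable only for small graphs.

```python
def matching_counts(edges):
    """Return [p(G,0), p(G,1), ...], the numbers of k-edge sets with no two edges adjacent."""
    if not edges:
        return [1]  # only the empty selection remains
    (u, v), rest = edges[0], edges[1:]
    # case 1: the edge (u, v) is not chosen
    without = matching_counts(rest)
    # case 2: the edge (u, v) is chosen, which excludes every edge touching u or v
    compatible = [e for e in rest if u not in e and v not in e]
    with_edge = [0] + matching_counts(compatible)
    size = max(len(without), len(with_edge))
    without += [0] * (size - len(without))
    with_edge += [0] * (size - len(with_edge))
    return [a + b for a, b in zip(without, with_edge)]

def hosoya_index(edges):
    """Return Z(G), the sum of the p(G,k) over all k."""
    return sum(matching_counts(edges))

# Hydrogen-suppressed graphs of the three pentane isomers (illustrative labelings)
n_pentane = [(0, 1), (1, 2), (2, 3), (3, 4)]
isopentane = [(0, 1), (1, 2), (2, 3), (1, 4)]
neopentane = [(0, 1), (0, 2), (0, 3), (0, 4)]

print(matching_counts(n_pentane))  # [1, 4, 3]
print(hosoya_index(n_pentane))     # 8
print(hosoya_index(isopentane))    # 7
print(hosoya_index(neopentane))    # 5
```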

The first index specifically designed to be of low degeneracy was the molecular
connectivity index of Randic [58]. This has proved to be a highly successful index
in that it is the most widely used of all indices propounded so far; moreover it
is the only index to have had a whole book devoted to it [41]. The index was
designed with the intention of characterizing branching in chemical species and
is based on the notion of edge types in molecular graphs. An edge is said to be
of type (v_1, v_2) if the two end vertices of the edge have degrees v_1 and v_2
respectively. In formal terms, the index may be defined by the relationship:

$$ \chi = \sum_{e=1}^{n_e} (v_1 v_2)_e^{-1/2} , \qquad (11) $$

where the summation extends over all edges e, and n_e is the total number of
edges in G. To date, the index has been used in a vast number of correlations
ranging from the prediction of physicochemical properties to the design of drugs
[41]. Its degeneracy is moderately low [72].
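
A minimal Python sketch of the molecular connectivity index is given below: each edge contributes the reciprocal square root of the product of the degrees of its two end vertices. The function name `randic_index` and the edge-list input are assumptions introduced here for illustration.

```python
from math import sqrt

def randic_index(n, edges):
    """Sum (v_1 * v_2) ** -0.5 over all edges, v_1 and v_2 being the end-vertex degrees."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(1.0 / sqrt(degree[u] * degree[v]) for u, v in edges)

# Hydrogen-suppressed graphs of the two butane isomers (illustrative labelings)
n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (0, 2), (0, 3)]

print(round(randic_index(4, n_butane), 4))   # 1.9142
print(round(randic_index(4, isobutane), 4))  # 1.7321
```

The branched isomer receives the smaller value, since its edges all involve the vertex of degree three.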

Because of the low degeneracy of χ(G) and its great value in correlational
studies, several attempts have been made to extend its range of usefulness. The
first proposal put forward [41] envisaged summing over paths of different lengths
instead of choosing paths of length one. This idea led to the introduction of
a whole range of χ(G) indices designated as ^0χ(G), ^1χ(G), ^2χ(G), ^3χ(G), etc.
for paths of length zero, one, two, three, etc. The index defined above should
thus more correctly be referred to as ^1χ(G). The generalized index, ^hχ(G), may
be defined by the equation:

$$ {}^{h}\chi(G) = \sum_{\pi} \left[ v_1(\pi) \, v_2(\pi) \cdots v_{h+1}(\pi) \right]^{-1/2} , \qquad (12) $$

where π extends over all paths of length h and v_j(π) denotes the valence of the
jth vertex on path π, with 1 ≤ j ≤ h+1. The ^hχ(G) index is, of course, also derivable
from the hth power of A(G). Each of the ^hχ(G) will give a different weighting
for the contributions made by primary (CH3), secondary (CH2), tertiary (CH),
and quaternary (C) carbon atoms. The basic purpose of such indices is to give
prominence to the contributions from adjacent and nearby atoms (vertices) and
to deemphasize those which are further away, in accord with chemical intuition.
The ^hχ(G) indices have moderately low degeneracies, and ^χ(G) correlates highly
(0.98) with W(G) [58].
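
The summation over paths in the generalized index may be sketched by direct enumeration, as in the Python fragment below. It is assumed here that each undirected path is counted once, that paths do not revisit vertices, and that the graph is connected with at least two vertices; the function name `chi` and the labelings are hypothetical. The function `math.prod` requires Python 3.8 or later.

```python
from math import prod, sqrt

def chi(n, edges, h):
    """Sum, over all paths with h edges, the inverse square root of the product of vertex degrees."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    degree = [len(neighbours) for neighbours in adj]

    def extend(path):
        if len(path) == h + 1:
            # an undirected path is reached once from each end; keep a single copy
            if h == 0 or path[0] < path[-1]:
                yield path
            return
        for w in adj[path[-1]]:
            if w not in path:
                yield from extend(path + [w])

    total = 0.0
    for start in range(n):
        for path in extend([start]):
            total += 1.0 / sqrt(prod(degree[v] for v in path))
    return total

# Hydrogen-suppressed graphs of the two butane isomers (illustrative labelings)
n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (0, 2), (0, 3)]

for h in range(4):
    print(h, round(chi(4, n_butane, h), 4), round(chi(4, isobutane, h), 4))
```

For h = 1 the function reduces to the original molecular connectivity index; for isobutane the term with h = 3 vanishes because the graph contains no path of length three.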

Further  means  of  elaborating  such  indices  have  also  been  examined  in  the 
chemical  literature.  Thus,  Balaban  [68]  put  forward  an  index  known  as  the  distance 
sum  connectivity  index,  J(G),  which  is  defined  as  follows: 

$$ J(G) = \frac{n_e}{n_e - n + 2} \sum_{e=1}^{n_e} (\delta_i \delta_j)_e^{-1/2} , \qquad (13) $$
where n_e is the number of edges in G, and δ_i represents the sum of the entries
in the ith row of the distance matrix D(G) for G. The degeneracy of J(G) has
been shown to be very low; in alkane graphs the first degenerate pair encountered
has n = 12 vertices [5]. Following several earlier studies on the characterization
of graph vertices in terms of their path numbers [54,58,91], Randic [60] proposed
combining the χ(G) with path numbers. This resulted in an index having a very
low degeneracy known as the molecular identity number, MID. The first pair
of alkane trees with identical MID numbers has 16 vertices [63].
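
The construction of J(G) from the row sums of the distance matrix can be sketched in Python as follows. The sketch assumes the customary normalizing factor n_e/(n_e - n + 2), where n is the number of vertices, together with a connected graph given as an edge list; the function name `balaban_index` is hypothetical.

```python
from collections import deque
from math import sqrt

def balaban_index(n, edges):
    """Distance sum connectivity index, using row sums of D(G) in place of vertex degrees."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    def distance_sum(source):
        # breadth-first search gives one row of D(G); its sum is the distance sum
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return sum(dist)

    delta = [distance_sum(i) for i in range(n)]
    n_e = len(edges)
    edge_sum = sum(1.0 / sqrt(delta[u] * delta[v]) for u, v in edges)
    return n_e / (n_e - n + 2) * edge_sum

# Hydrogen-suppressed graphs of the two butane isomers (illustrative labelings)
n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (0, 2), (0, 3)]

print(round(balaban_index(4, n_butane), 4))   # 1.9747
print(round(balaban_index(4, isobutane), 4))  # 2.3238
```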


Approaching  the  Ultimate  Goal 


The success in developing ever more discriminating indices with lower and
lower degeneracies has prompted several researchers to pose the question: can
a  simple,  graph-theoretical,  numerical  descriptor  be  derived  which  will  be  unique, 
at  least  for  the  classes  of  graphs  of  interest  to  chemists?  Although  much  progress 
has  been  made  on  the  difficult  task  of  characterizing  alkane  trees  uniquely  by 
means  of  such  an  index,  this  ultimate  goal  seems  to  be  a  very  elusive  one. 
Numerous conjectures put forward over the years postulating that certain indices,
including the Randic MID number, were unique have subsequently been proved
to be invalid [85,87]. In spite of this, new conjectures continue to be made. For
instance,  it  has  recently  been  conjectured  [1]  that  if  distance  sums  and  path 
numbers  were  used  in  the  MID  number  instead  of  vertex  degrees  and  path  numbers, 
the  degeneracy  would  vanish.  The  search  for  unique  indices  will  almost  certainly 
be  continued  for  many  years  to  come.  Below  we  touch  upon  some  of  the  more 
novel  approaches  which  have  been  explored  recently  and  which  are  claimed  to 
lead  to  highly  discriminating,  if  not  unique,  descriptors  for  alkane  tree  graphs. 
The use of random walks on trees has been investigated by Randic et al. [70]
and  Barysz  and  Trinajstic  [6].  The  former  workers  used  random  walks  to 
characterize  graphs  by  enumerating  all  the  walks  for  every  individual  vertex. 
Attempts  were  then  made  to  decide  which  factors  were  critical  in  determining 
the  walk  counts,  and  to  locate  isospectral  vertices  in  graphs.  Unusual  walks, 
i.e.  walks  for  nonequivalent  sites  which  have  the  same  counts,  are  of  fundamental 
importance  in  the  study  of  isospectral  graphs.  These  facts  were  exploited  by 
the  latter  workers  to  establish  a  1-1  correspondence  between  trees  and  a  code 
called  the  ordered  structural  code  [6].  This  code  distinguishes  even  isospectral 
graphs.  The  code,  which  is  claimed  to  be  unique,  can  be  used  for  calculating 
the  coefficients  of  the  characteristic  polynomials  of  trees  and  for  demonstrating 
the  dependence  of  the  spectral  moments  on  the  various  tree  structures.  Spectral 
moments are obtained by summing the diagonal elements of (A(G))^k for each
k, and correspond directly to the count of all self-returning walks of length k
in a given molecular graph.
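
The relation between spectral moments and self-returning walks can be checked numerically with the short Python sketch below, which forms the kth power of the adjacency matrix by repeated multiplication and sums its diagonal. The function name `spectral_moment` and the use of plain nested lists are assumptions made for the sake of a self-contained illustration.

```python
def spectral_moment(n, edges, k):
    """Return the trace of A(G)**k, i.e. the number of self-returning walks of length k."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    # start from the identity matrix and multiply by A(G) k times
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        power = [[sum(power[i][m] * A[m][j] for m in range(n)) for j in range(n)]
                 for i in range(n)]
    return sum(power[i][i] for i in range(n))

# Hydrogen-suppressed graphs of the two butane isomers (illustrative labelings)
n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (0, 2), (0, 3)]

print([spectral_moment(4, n_butane, k) for k in range(5)])   # [4, 0, 6, 0, 14]
print([spectral_moment(4, isobutane, k) for k in range(5)])  # [4, 0, 6, 0, 18]
```

The second moment equals twice the number of edges, and the odd moments vanish for trees; in this example the two isomers are first separated by the fourth moment.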

Information  theory  has  played  a  role  in  the  development  of  new  topological 
indices  for  many  years.  Recently,  a  book  devoted  solely  to  this  subject  has 
appeared [10]. One of the most successful information-theoretical indices in terms
of discriminating power is the so-called mean information on distance equality
index, defined as follows [13]:


$$ \bar{I}_D^E(G) = - \sum_{\ell=1}^{m} \frac{2k_\ell}{n(n-1)} \log_2 \frac{2k_\ell}{n(n-1)} , \qquad (14) $$


where the distance ℓ appears 2k_ℓ times in the distance matrix D(G) for the graph
G and m is the greatest value of ℓ. Another very successful index is the so-called
graph distance complexity index advanced by Raychaudhury et al. [71], which
is based on an average information measure for G. Even though both display
high discriminatory power for alkane trees, it was postulated by Bonchev et al.
[13] that an effective practical solution to the problem of discrimination would
be the introduction of a superindex. Such an index is simply a sum of several
separate topological indices. Using a superindex based on six topological indices,
Bonchev et al. [13] achieved complete discrimination of a set of 427 graphs of
chemical interest.
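
A minimal Python sketch of the mean information on distance equality is given below. It assumes that the index is the Shannon entropy, taken with base-2 logarithms, of the distribution in which the distance ℓ has probability 2k_ℓ/n(n-1); the function name and the edge-list input are hypothetical, and the graph is assumed to be connected with at least two vertices.

```python
from collections import Counter, deque
from math import log2

def distance_equality_information(n, edges):
    """Shannon entropy (in bits) of the distribution of distances over all vertex pairs."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    k = Counter()  # k[l] = number of unordered vertex pairs at distance l
    for source in range(n):
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for target in range(source + 1, n):
            k[dist[target]] += 1
    pairs = n * (n - 1) // 2  # k_l / pairs equals 2 k_l / (n (n - 1))
    return -sum((c / pairs) * log2(c / pairs) for c in k.values())

propane = [(0, 1), (1, 2)]
neopentane = [(0, 1), (0, 2), (0, 3), (0, 4)]

print(round(distance_equality_information(3, propane), 4))     # 0.9183
print(round(distance_equality_information(5, neopentane), 4))  # 0.971
```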

An  index  which  is  of  extremely  low  degeneracy,  and  which  has  the  advantage 
of  being  easily  obtainable,  is  based  on  the  hierarchically  ordered  extended  vertex 
connectivities  in  G.  The  algorithm  used  to  calculate  the  index,  commonly  referred 
to  as  the  HOC  algorithm  [47],  starts  with  a  partitioning  of  the  vertices  of  G 
into  equivalence  classes  according  to  their  degrees.  Additional  discrimination 
is  built  into  each  of  the  classes  by  means  of  vertex  extended  connectivities, 
i.e.  sums  of  vertex  degrees  of  the  nearest  neighbors.  For  equal  extended 
connectivities,  further  discrimination  is  introduced  via  the  sequences  of  the 
degrees,  arranged  in  ascending  order.  The  newly  formed  equivalence  classes 
are  assigned  ranks  that  increase  with  the  extended  connectivities  and  their  ordered 
summations.  These  ranks  are  then  used  for  the  iterative  recalculation  of  the 
extended  connectivities;  the  whole  procedure  is  terminated  when  the  same  ranks 
appear after two consecutive steps. The approach represents a natural extension
of Morgan's [48] algorithm, which forms the basis of the Chemical Abstracts coding
system, and is analogous to a procedure of Randic and Wilkins [67], based on
sequences  of  path  numbers,  in  that  it  can  be  employed  for  the  recognition  of 
structural  similarity  in  molecular  graphs. 
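
The iterative use of extended connectivities may be illustrated by the simpler Morgan-type scheme on which the approach builds. The Python sketch below is not the HOC algorithm itself: it omits the ranking step and the tie-breaking by ordered degree sequences described above, and simply replaces each vertex value by the sum of the values of its neighbors until the number of distinct values ceases to grow. The function name is hypothetical.

```python
def morgan_extended_connectivities(n, edges):
    """Iterate neighbor sums, starting from vertex degrees, while the number of classes grows."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    values = [len(neighbours) for neighbours in adj]  # initial classes: vertex degrees
    n_classes = len(set(values))
    while True:
        candidate = [sum(values[w] for w in adj[v]) for v in range(n)]
        if len(set(candidate)) <= n_classes:
            return values  # no further discrimination was gained
        values, n_classes = candidate, len(set(candidate))

n_pentane = [(0, 1), (1, 2), (2, 3), (3, 4)]
isopentane = [(0, 1), (1, 2), (2, 3), (1, 4)]

print(morgan_extended_connectivities(5, n_pentane))   # [2, 3, 4, 3, 2]
print(morgan_extended_connectivities(5, isopentane))  # [1, 3, 2, 1, 1]
```

In n-pentane one iteration separates the central vertex from its two neighbors, whereas in isopentane no iteration improves on the partition by degree. Vertices that remain in one class, such as the three terminal vertices of isopentane, are not necessarily equivalent, which is the kind of residual ambiguity that the additional discrimination steps described above are intended to address.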

The  Chemical  Ordering  of  Branched  Species 

Our earlier discussion has revealed that molecular graphs can in general be
partially ordered in purely mathematical terms according to their degree of
branching. Moreover, if the two graphs G_j and G_k within a given class (here
alkane graphs) can be associated with the numbers r_j and r_k in such a way that
r_j > r_k whenever it is decided G_j is more branched than G_k, a measure of the
branching is implied. We may now enquire whether such a measure accords with
the  ordering  of  these  graphs  based  on  the  observed  physicochemical  properties 
of  the  molecules  concerned.  If  the  graphs  in  a  particular  class  of  graphs  are 
associated  with  so  many  different  values  of  a  property  over  a  given  interval 
that  they  may  be  interpreted  as  representing  a  continuum  of  properties,  the 
theorem  of  Karamata  [40]  can  be  applied.  Karamata's  theorem,  which  is  valid 
for  continuous  and  convex  functions  defined  on  a  sequence  of  numbers,  permits 
conclusions  to  be  drawn  concerning  the  relative  magnitudes  of  the  function  if 
the  relative  magnitudes  for  the  terms  in  the  sequence  are  known.  Using  some 
well-selected  subgraph  structures  to  yield  a  numerical  sequence,  it  thus  becomes 
possible  to  predict  the  relative  magnitudes  of  molecular  properties  of  interest. 
This was accomplished by Randic et al. [69,64,68] for the alkane isomers up to
n = 10. The selected graph invariants were paths of different lengths (especially
of  lengths  two  and  three)  and  the  physicochemical  parameters  ranged  from  boiling 
points  through  thermodynamic  properties  to  refractive  indices.  In  all  cases  the 
trends  established  by  mathematical  ordering  corresponded  with  those  based  on 
physicochemical  properties.  This  demonstrated  that  grid  diagrams  represent 
a  convenient  device  for  the  ordering  and  prediction  of  properties,  and  established 
the  significance  of  the  underlying  conceptual  framework.  Apparent  inconsistencies 
or  errors  in  the  raw  data  are  clearly  revealed  using  this  approach  [68]. 
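
The numbers of paths of length two and three that serve as coordinates in such an ordering are easily obtained for trees. The Python sketch below uses two closed-form counts which hold for acyclic graphs: each vertex of degree d is the midpoint of d(d-1)/2 paths of length two, and each edge with end-vertex degrees d_u and d_v is the central edge of (d_u - 1)(d_v - 1) paths of length three. The function name and the labelings are assumptions made for illustration.

```python
def path_counts(n, edges):
    """Return (p2, p3), the numbers of paths of length two and three in a tree."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # every pair of edges meeting at a vertex forms one path of length two
    p2 = sum(d * (d - 1) // 2 for d in degree)
    # every edge is the middle edge of (d_u - 1) * (d_v - 1) paths of length three
    p3 = sum((degree[u] - 1) * (degree[v] - 1) for u, v in edges)
    return p2, p3

# Two of the octane isomers (illustrative labelings)
n_octane = [(i, i + 1) for i in range(7)]
tetramethylbutane = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)]

print(path_counts(8, n_octane))           # (6, 5)
print(path_counts(8, tetramethylbutane))  # (12, 9)
```

The unbranched isomer and 2,2,3,3-tetramethylbutane thus have the smallest and largest numbers of paths of length two among the graphs considered here.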

A  different  way  of  interpreting  the  behavior  of  branched  alkane  species  is 
that  based  on  the  additive  nature  of  most  of  their  physicochemical  properties. 
This approach has been exhaustively investigated by Gordon and Kennedy [27,25],
who postulated the idea of expressing all measurable parameters of a chemical
system  in  terms  of  a  linear  combination  of  graph-theoretical  invariants.  Such 
a  derived  parameter,  M,  can  be  represented  by  the  summation: 

$$ M = \sum_{i} \alpha_i N_i , \qquad (15) $$

where the α_i are coefficients, and the N_i are appropriate graph invariants. This
simple  formulation  effectively  summarizes  all  the  manifold  additivity  schemes 
which  have  been  proposed  in  the  chemical  literature  over  the  past  century  [27]. 
It  should  be  borne  in  mind,  however,  that  the  approach  is  a  purely  graph-theoretical 
one  and  that  properties  governed  by  stereospecificity  or  precise  geometry  will 
be  beyond  its  scope.  Even  with  this  restriction,  the  value  of  equation  (15)  is 
beyond  doubt,  for  it  has  been  established  [25]  that  each  parameter  analyzed  in 
this  way  becomes  stable  to  the  introduction  of  further  invariants  beyond  a  certain 
point.  The  stable  values  can  readily  be  calculated  and  used  for  comparisons 
of  properties  derived  from  mathematical  ordering. 

The N_i in equation (15) are, of course, topological indices and some of the
indices  mentioned  above  have  been  employed  in  this  type  of  analysis.  In  particular, 
paths  of  different  lengths  have  been  widely  featured  [86].  Trends  in  more  complex 
topological  indices  with  branching  have  also  been  presented  by  several  workers. 
Thus,  for  the  Wiener  index,  W(G),  Bonchev  and  Trinajstic  [14]  have  given  detailed 
mathematical  expressions  for  the  variation  in  the  value  of  W(G)  with  the  differing 
types  of  branching  encountered  in  alkane  species.  In  the  case  of  the  Hosoya 
index,  Z(G),  a  composition  principle  was  given  [38]  from  which  it  was  apparent 
that Z(G) depends on certain subgraphs of G for alkane isomers. Randic's molecular
connectivity indices, ^hχ(G), have also been investigated [41,72] with a view
to interpreting their dependence on various graph invariants. In general, however,
topological indices do not give good correlations with the physicochemical
properties of branched species.

In  an  attempt  to  overcome  this  problem  with  topological  indices,  Bonchev 
and  Mekenyan  [12]  introduced  the  concept  of  the  comparability  graph  for  the 
ordering  of  alkane  and  other  isomers.  The  comparability  graph  is  constructed 
for  a  complete  set  of  isomers  by  making  use  of  known  rules  of  structural 
complexity,  e.g.  those  put  forward  by  Bonchev  and  Trinajstic  [14].  Each  rule 
serves  to  partially  order  the  isomers  by  expressing  trends  which  occur  in  various 
topological  indices  as  systematic  changes  to  the  structure  of  the  isomers  are 
made.  In  such  graphs,  the  vertices  correspond  to  individual  isomers  and  the 
directed  edges  to  isomer  interconversions.  The  paths  in  these  oriented  graphs 
specify the ordering of the vertices; isomers associated with different graph
paths are taken to be noncomparable. Combined comparability graphs based
on several different topological indices were set up for alkane isomers with n
= 7 (the heptanes) and n = 8 (the octanes), including a total of 20 physicochemical
properties.  The  majority  of  properties  followed  the  predicted  ordering;  those 
showing  the  greatest  deviations  were  the  critical  temperature,  the  Antoine 
equation  coefficient,  surface  tension,  molecular  volume  density,  molecular 
refraction  and  refractive  index.  These  properties  may  well  depend  on 
graph-theoretical factors not included in the invariants used in constructing
the  comparability  graph,  and  also  on  stereochemical  and  geometrical  effects. 
A  similar  approach  based  on  the  degree  of  structural  similarity  of  pairs  of  isomers 
has  recently  been  put  forward  by  Grossman  [28]. 

Conclusion 


The problem of characterizing branching in a completely satisfactory way
to the physical scientist is likely to remain unsolved for the foreseeable future.
The  two  main  reasons  for  this  are  that  (i)  the  notion  of  branching  is  an  essentially 
intuitive  one,  and  (ii)  in  general  different  physicochemical  properties  seem  to 
require  different  orderings  of  sets  of  isomers.  Thus,  in  spite  of  many  highly 
ingenious  approaches  to  the  quantification  of  branching  in  molecular  species, 
only  a  partial  ordering  can  be  attained  in  most  cases.  Such  partial  orderings 
are  based  on  mathematical  criteria  such  as  those  of  Muirhead  [50],  and  are 
appropriate  for  certain  physicochemical  properties,  but  by  no  means  all  of  them. 
The  latter  properties  are  probably  not  dependent  to  the  same  extent  on  the 
molecular  connectivity  as  the  former,  and  in  addition  may  also  be  strongly 
influenced  by  geometric  or  stereochemical  factors.  At  present  it  is  not  possible 
to  characterize  molecular  graphs  uniquely  in  terms  of  graph  invariants,  but  several 
invariants have been shown to possess high discrimination ability. Codes, however,
based  on  the  adjacency  matrix,  A(G),  of  the  graph  are  able  to  provide  unique 
characterizations  of  molecular  graphs,  although  these  are  rather  unwieldy  and 
therefore  unsuitable  for  most  chemical  correlations. 
Captions for Tables


Table  1.  Number  of  alkane  constitutional  isomers  (trees)  for  various  values  of 
n,  the  number  of  carbon  atoms. 

Table 2. Number of paths of length j (1 ≤ j ≤ 7) for the alkane isomers having
n = 8 (the octanes).

Captions  for  Figures 

Figure  1.  Hydrogen-suppressed  graphs  of  the  octane  isomers  (a)  2-methylheptane, 
(b)  3-methylheptane,  and  (c)  4-methylheptane. 

Figure  2.  The  nonascending  vertex  degree  sequence  and  Young  diagram  for 
each  of  the  three  isomers  in  Figure  1. 

Figure  3.  Grid  of  the  18  octane  isomers  showing  an  ordering  based  on  the  number 
of paths of length two (p_2) and of length three (p_3) in each.

Figure  4.  Canonical  labelings  and  canonical  codes  for  the  three  pentane  isomers. 

Figure  5.  Pairs  of  hydrogen-suppressed  alkane  graphs  displaying  identical  indices 
of  varying  kinds. 

Figure 6. A scatter plot of boiling point against Wiener index, W(G), for the
75 decane isomers.
Table 1: Number of alkane constitutional isomers (trees) for various
values of n, the number of carbon atoms.
Name of Molecule Graph